OpenCV fundamentals

This notebook covers opening files, looking at pixels, and some simple image processing techniques.

We'll use the following sample image, stolen from the Internet. But you can use whatever image you like.

If you can't see an image above, then you haven't got the full tutorial code from GitHub. In the same directory as this notebook you should also have the following files:

1 Getting started notebook.ipynb
2 Fundamentals.ipynb
3 Image stats and image processing.ipynb
4 Features.ipynb
5 detecting faces and other things.ipynb
6 Moving away from the notebook.ipynb
cheat.py
common.py
common.pyc
edgedemo.png
haarcascade_frontalface_default.xml
LICENSE
noidea.jpg
play.py
README.md
start.py
test.jpg
video.py
video.pyc

If you haven't got them, make sure you've got the whole repo from [https://github.com/handee/opencv-gettingstarted]

Python getting started

First we need to import the relevant libraries: OpenCV itself, Numpy, and a couple of others. common.py and video.py are simple data-handling and file-opening helpers that you can find in the OpenCV Python samples directory or in the GitHub repo linked above. We'll start each notebook with the same imports - you don't need all of them every time (so this is bad form, really) but it's easier to just copy and paste.


In [1]:
# these imports let you use opencv
import cv2 #opencv itself
import common #some useful opencv functions
import video # some video stuff
import numpy as np # matrix manipulations

#the following are to do with this interactive notebook code
%matplotlib inline 
from matplotlib import pyplot as plt # this lets you draw inline pictures in the notebooks
import pylab # this allows you to control figure size 
pylab.rcParams['figure.figsize'] = (10.0, 8.0) # this controls figure size in the notebook

Now we can open an image:


In [2]:
input_image=cv2.imread('noidea.jpg')

We can find out various things about that image


In [3]:
print(input_image.size)


776250

In [6]:
print(input_image.shape)


(414, 625, 3)

In [21]:
print(input_image.dtype)


uint8

Gotcha: that last one (the datatype) is one of the tricky things about working in Python. As Python is dynamically typed, it will happily let you have arrays of the same size but different types, and some functions will return arrays of types you probably don't want. Being able to check and inspect the datatype like this is very useful, and it's one of the things I often find myself doing when debugging.
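To see why the dtype matters, here's a minimal numpy-only sketch (no image needed, the values are made up) of the classic uint8 overflow gotcha:

```python
import numpy as np

# pixel values near the top of the uint8 range
a = np.array([200, 250], dtype=np.uint8)

# uint8 arithmetic wraps around modulo 256, silently
print(a + np.uint8(100))         # [44 94], not [300 350]

# widen the type first if you want real sums
print(a.astype(np.int32) + 100)  # [300 350]
```

This kind of silent wraparound is exactly the sort of bug that checking `.dtype` helps you track down.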


In [22]:
plt.imshow(input_image)


Out[22]:
<matplotlib.image.AxesImage at 0x7f0c84204490>

What this illustrates is something key about OpenCV: it doesn't store images in RGB order, but in BGR order. Matplotlib assumes RGB, which is why the colours in the plot above look wrong.
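Because the channels are just the last axis of a numpy array, one quick fix is to reverse that axis with plain slicing. A minimal sketch on a made-up two-pixel image (the `bgr` array here is invented for illustration):

```python
import numpy as np

# a tiny 1x2 BGR "image": one pure-blue pixel, one pure-red pixel
bgr = np.array([[[255, 0, 0],
                 [0, 0, 255]]], dtype=np.uint8)

rgb = bgr[:, :, ::-1]  # reverse the channel axis: BGR -> RGB

print(rgb[0, 0])  # [  0   0 255] - blue is now in the last slot, as RGB expects
print(rgb[0, 1])  # [255   0   0]
```

So `plt.imshow(input_image[:, :, ::-1])` would display the dog with the right colours, though the cleaner OpenCV way is shown below.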


In [23]:
# split channels
b,g,r=cv2.split(input_image)
# show one of the channels (this is red - see that the sky is kind of dark. try changing it to b)
plt.imshow(r, cmap='gray')


Out[23]:
<matplotlib.image.AxesImage at 0x7f0c84144510>

converting between colour spaces, merging and splitting channels

We can convert between various colourspaces in OpenCV easily. We've seen how to split, above. We can also merge channels:


In [24]:
merged=cv2.merge([r,g,b])
# merge takes an array of single channel matrices
plt.imshow(merged)


Out[24]:
<matplotlib.image.AxesImage at 0x7f0c84078810>

OpenCV also has a function specifically for converting between image colorspaces, cvtColor, so rather than splitting and merging channels by hand you can use that instead. It is usually marginally faster...

There are more than 250 colour-related flags in OpenCV for conversion and display. The ones you are most likely to use are COLOR_BGR2RGB for RGB conversion, COLOR_BGR2GRAY for conversion to greyscale, and COLOR_BGR2HSV for conversion to Hue, Saturation, Value colour space. [http://docs.opencv.org/trunk/de/d25/imgproc_color_conversions.html] has more information on how these colour conversions are done.


In [25]:
COLORflags = [flag for flag in dir(cv2) if flag.startswith('COLOR')]
print(len(COLORflags))
# print(COLORflags)
# if you want to see them all, rather than just a count


271

In [26]:
opencv_merged=cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB)
plt.imshow(opencv_merged)


Out[26]:
<matplotlib.image.AxesImage at 0x7f0c7df90c50>

Getting image data and setting image data

Images in Python OpenCV are numpy arrays. Numpy arrays are optimised for whole-array operations, so there is usually a fast vectorised way to do a calculation without writing the per-pixel detail yourself. Accessing individual pixels from Python is therefore slow and usually bad practice, but you can do it.


In [27]:
pixel = input_image[100,100]
print(pixel)


[150 161 153]

In [28]:
input_image[100,100] = [0,0,0]
pixelnew = input_image[100,100]
print(pixelnew)


[0 0 0]
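Setting one pixel at a time like this is fine for a quick poke, but for whole-image work you want a single vectorised expression instead of nested Python loops. A minimal numpy-only sketch (a random array stands in for the real image here):

```python
import numpy as np

# stand-in for input_image: random uint8 pixels with the same shape
img = np.random.randint(0, 256, size=(414, 625, 3), dtype=np.uint8)

# the slow way would be two nested Python loops over y and x;
# the fast way is one whole-array operation, e.g. inverting every pixel
inverted = 255 - img

# a concrete check on one made-up pixel value
sample = np.array([0, 10, 255], dtype=np.uint8)
print(255 - sample)  # [255 245   0]
```

The vectorised version runs in optimised C inside numpy, so it is orders of magnitude faster than looping in Python.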

Getting and setting regions of an image

In the same way as we can get or set individual pixels, we can get or set regions of an image. This is a particularly useful way to get a region of interest to work on.


In [29]:
dogface = input_image[60:250, 70:350]
plt.imshow(dogface)


Out[29]:
<matplotlib.image.AxesImage at 0x7f0c7dec4f90>

In [30]:
fresh_image=cv2.imread('noidea.jpg') # it's either start with a fresh read of the image,
                                     # or end up with dogfaces on dogfaces on dogfaces
                                     # as you re-run parts of the notebook but not others...

fresh_image[200:200+dogface.shape[0], 200:200+dogface.shape[1]]=dogface
print(dogface.shape[0])
print(dogface.shape[1])
plt.imshow(fresh_image)


190
280
Out[30]:
<matplotlib.image.AxesImage at 0x7f0c7ddfc750>

Matrix slicing

In OpenCV Python style, as I have mentioned, images are numpy arrays. There are some superb numpy array-manipulation tutorials out there: [http://www.scipy-lectures.org/intro/numpy/numpy.html#indexing-and-slicing] is a great introduction if you've not done it before. The getting and setting of regions above uses slicing, though, and I'd like to finish this notebook with a little more detail on what is going on there.


In [31]:
freshim2 = cv2.imread("noidea.jpg")
crop = freshim2[100:400, 130:300] 
plt.imshow(crop)


Out[31]:
<matplotlib.image.AxesImage at 0x7f0c7ddb48d0>

The key thing to note here is that the slicing works like

[top_y:bottom_y, left_x:right_x]

This can also be thought of as

[y:y+height, x:x+width]

You can also use slicing to separate out channels. In this case you want

[y:y+height, x:x+width, channel]

where channel represents the colour you're interested in - this could be 0 = blue, 1 = green, or 2 = red if you're dealing with a default OpenCV image, but if you've got an image that has been converted it could be something else. Here's an example that converts to HSV and then selects the S (Saturation) channel of the same crop as above:
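As a pure-numpy sketch of that `[y:y+height, x:x+width, channel]` pattern (the coordinates and the tiny constant-plane image here are invented for illustration):

```python
import numpy as np

# a made-up 6x8 3-channel image where each channel holds a constant value
img = np.zeros((6, 8, 3), dtype=np.uint8)
img[:, :, 0] = 10   # "blue" plane
img[:, :, 1] = 20   # "green" plane
img[:, :, 2] = 30   # "red" plane

y, x, height, width = 1, 2, 3, 4
roi_green = img[y:y+height, x:x+width, 1]  # one channel of one region

print(roi_green.shape)  # (3, 4)
print(roi_green[0, 0])  # 20
```

Note that selecting a single channel drops that axis, so the result is a 2D greyscale-style array.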


In [32]:
hsvim = cv2.cvtColor(freshim2, cv2.COLOR_BGR2HSV)
satcrop = hsvim[100:400, 130:300, 1]
plt.imshow(satcrop, cmap="gray")


Out[32]:
<matplotlib.image.AxesImage at 0x7f0c7dcdee10>
